Are Large Language Models Capable of Deep Relational Reasoning? Insights from DeepSeek-R1 and Benchmark Comparisons
Chi Chiu So, Yueyue Sun, Jun-Min Wang, Siu Pang Yung, Anthony Wai Keung Loh, Chun Pong Chau
https://arxiv.org/abs/2506.23128
Investigating the Reasonable Effectiveness of Speaker Pre-Trained Models and their Synergistic Power for SingMOS Prediction
Orchid Chetia Phukan, Girish, Mohd Mujtaba Akhtar, Swarup Ranjan Behera, Pailla Balakrishna Reddy, Arun Balaji Buduru, Rajesh Sharma
https://arxiv.org/abs/2506.02232
Athena: Enhancing Multimodal Reasoning with Data-efficient Process Reward Models
Shuai Wang, Zhenhua Liu, Jiaheng Wei, Xuanwu Yin, Dong Li, Emad Barsoum
https://arxiv.org/abs/2506.09532
InsertRank: LLMs can reason over BM25 scores to Improve Listwise Reranking
Rahul Seetharaman, Kaustubh D. Dhole, Aman Bansal
https://arxiv.org/abs/2506.14086
Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models
Chenchen Yuan, Zheyu Zhang, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci
https://arxiv.org/abs/2506.14625
“CVE naming and assignment to software packages and versions are the foundation upon which the software vulnerability ecosystem is based. Without it, we can’t track newly discovered vulnerabilities. We can’t score their severity or predict their exploitation. And we certainly wouldn’t be able to make the best decisions regarding patching them.”
A Shortcut-aware Video-QA Benchmark for Physical Understanding via Minimal Video Pairs
Benno Krojer, Mojtaba Komeili, Candace Ross, Quentin Garrido, Koustuv Sinha, Nicolas Ballas, Mahmoud Assran
https://arxiv.org/abs/2506.09987
MCTS-Refined CoT: High-Quality Fine-Tuning Data for LLM-Based Repository Issue Resolution
Yibo Wang, Zhihao Peng, Ying Wang, Zhao Wei, Hai Yu, Zhiliang Zhu
https://arxiv.org/abs/2506.12728
Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
Wuwei Zhang, Fangcong Yin, Howard Yen, Danqi Chen, Xi Ye
https://arxiv.org/abs/2506.09944